Electronic document filing system

The present invention which is the subject of this application 
relates to a method and system for sorting and filing e-mails or 
other electronic documents. 

Typically, electronic documents need to be filed in a system 
memory such as that of a Personal Computer, in a manner which 
allows the same to be identified and retrieved. Conventionally a 
multilayered, or hierarchical storage structure is used. 

However, with a complex hierarchical filing structure it can be 
time consuming to traverse, scroll and attempt to find the 
appropriate file folder for the electronic document. Currently, 
two facilities assist this process in that the navigated structure 
can be partially expanded, and/or a history of most recently 
accessed folders is available. 

However, with disparate sources of electronic documents
coming in to the system, the history is only partially valuable,
while the expanded hierarchy exposes more of the structure but
requires substantial scrolling through the structure by the user.

Thus, while both facilities may be of limited use, they can still
require the user to spend a significant amount of time when
trying to file or retrieve an electronic document.

The aim of this invention is to provide an analysis of an
electronic document attribute or attributes such as the header,
audience, sender and/or content and therefore provide a
suggested location or locations in the storage system in which to
file it.



In a first aspect of the invention there is provided a method of 
storage and/or filing of electronic documents wherein said 
method includes the compilation of a list of possible filing 
locations within a document storage system, assessing each 
location and allocating a weighting value to each location with 
respect to other locations and upon receipt of an electronic 
document assessing at least one attribute of the document and, 
with reference to the weighted values of the selectable locations 
for storage, selecting to locate said electronic document in at 
least one of the storage locations. 

Typically, for each incoming document, a correlation is made 
against a database representative of the filing properties of the 
storage locations of the filing system which is being used to 
store those documents. 

Preferably, a certain number, say 5-10, of the best correlations 
can be presented, such that if a correlation is matched for an 
incoming document, that document can be stored in a storage 
location automatically or by instant selection without the need 
to traverse or descend into the filing hierarchy. Thus,
considerable savings in time and a reduction in the frustration
caused to the user are achieved by this invention.
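
The presentation of the best correlations as shortcuts can be sketched as follows. This is a minimal illustration only, under stated assumptions: the `FolderProfile` record, the `WEIGHTS` values, the function names and the `owner@example.com` address are hypothetical and are not specified in this description, and simple set membership stands in for whatever correlation measure is actually used.

```python
from dataclasses import dataclass, field


@dataclass
class FolderProfile:
    """Hypothetical record of the filing properties held for one storage location."""
    path: str
    senders: set = field(default_factory=set)
    recipients: set = field(default_factory=set)
    keywords: set = field(default_factory=set)


# Illustrative attribute weights (assumed values, not taken from the description).
WEIGHTS = {"sender": 3.0, "recipient": 2.0, "keyword": 1.0}


def score(profile, email):
    """Sum the weighted attribute matches between one e-mail and one folder."""
    total = 0.0
    if email.get("from") in profile.senders:
        total += WEIGHTS["sender"]
    if email.get("to") in profile.recipients:
        total += WEIGHTS["recipient"]
    words = set(email.get("subject", "").lower().split())
    total += WEIGHTS["keyword"] * len(words & profile.keywords)
    return total


def suggest_shortcuts(profiles, email, limit=5):
    """Return up to `limit` folder paths with a non-zero score, best first."""
    scored = [(score(p, email), p.path) for p in profiles]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [path for _, path in ranked[:limit]]


profiles = [
    # The sender address below is a placeholder, not one from the description.
    FolderProfile("Companies\\Retailers\\Mr Smiths Shop", senders={"owner@example.com"}),
    FolderProfile("Companies\\Retailers\\Confederation of retailers",
                  recipients={"board@confed.org"}),
    FolderProfile("Technical\\Distribution", keywords={"abcd"}),
]
email = {"from": "random@newco", "to": "joe@work", "subject": "Our ABCD is the best"}
print(suggest_shortcuts(profiles, email))
```

An empty list from `suggest_shortcuts` corresponds to the case where no shortcut location is relevant and the document is filed by the conventional method.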

If, upon analysis of an incoming document, a matching 
correlation is not identified such that none of the "shortcut" 
storage locations are relevant, then the document can be stored 
in a storage location using the conventional method of 
document filing. 

Typically, as new documents are added into the filing system,
the database of filing properties used for the correlation and
analysis can be adapted to reflect the documents received in
order to ensure statistically significant correlating features are
used.

Typically, the attributes of the document which are assessed can
be set by the system and/or user. Attributes which, it is
submitted, can be usefully assessed are any one or any
combination of the following: the document sender's name, the
sender's company, the target audience, a header text match
against folder titles, core text correlation against folder titles,
keyword extraction from filed documents, and/or header text
correlations against filed documents. However, this list is not
intended to be exhaustive and should not be interpreted as
limiting the parameters which can be selected.

Clearly, some attributes are more easily assessed and detected
than others. Furthermore, in the analysis of certain attributes
some level of statistical significance can be attached to the
results so that they are meaningful. For example, a high
correlation of the word "the" might occur, yet it would not be a
statistically significant differentiator among the file folders.

This is why a companion database associated with the file
structure is preferred. This would hold, for example,
statistically differentiating keywords associated with a particular
folder, and only these keywords would be used to correlate
against the e-mail to be filed, thus affording a reduction in
computational effort over systems that would otherwise have to
perform detailed correlations against the actual folder contents
as each new item arrives.
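
One way the companion database might select differentiating keywords is sketched below. The description does not name a particular statistic, so the rule used here (keep a word for a folder only if it occurs in no more than a given share of all folders) is an assumption, as are the function name, the `max_folder_share` parameter and the sample texts other than "Latest shipping uses ABCD technology".

```python
from collections import Counter


def differentiating_keywords(folder_texts, max_folder_share=0.5):
    """Keep, per folder, only the words found in at most `max_folder_share`
    of all folders, so that ubiquitous words such as "the" are dropped."""
    words_by_folder = {
        folder: {w for text in texts for w in text.lower().split()}
        for folder, texts in folder_texts.items()
    }
    # Number of folders in which each word appears at least once.
    folder_count = Counter(w for words in words_by_folder.values() for w in words)
    limit = max_folder_share * len(words_by_folder)
    return {
        folder: {w for w in words if folder_count[w] <= limit}
        for folder, words in words_by_folder.items()
    }


folder_texts = {
    "Technical\\Distribution": [
        "Latest shipping uses ABCD technology",
        "The ABCD widget",        # hypothetical sample text
    ],
    "Companies\\Retailers": [
        "The meeting agenda",     # hypothetical sample text
        "The retail report",      # hypothetical sample text
    ],
}
keywords = differentiating_keywords(folder_texts)
print(sorted(keywords["Technical\\Distribution"]))
```

Because "the" appears in both folders it is excluded from both keyword sets, while "abcd" is retained for the Technical\Distribution folder only. An incoming e-mail then needs to be compared only against these stored sets rather than against the folder contents.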

A specific example of the invention is now described with
reference to the accompanying diagram, which illustrates in
schematic fashion an electronic document filing system, in this
case an e-mail filing system, in accordance with the invention.

In this case two sets of files are identified: the first, file 1,
relates to the attribute of companies and the second relates to
the attribute "technical". Each of the files is split into a series
of folders, each having an identified attribute within that file,
such as, in the case of the companies file, "retailers", "financial"
and "government". Each of these may have further folders as
indicated.

1 - Companies

    (i) - Retailers

        - Mr Smiths Shop

            e: "blah, blah" from: msmith@myshop.com...
            e: ...

        - Confederation of retailers

            e: "Meeting 27th..." to: board@confed.org
            e: ...

    + (ii) Financial

    + (iii) Government

2 - Technical

    (i) - Distribution

        e: Latest shipping uses ABCD technology
        e: Company X designs ABCD widget
        e: re: Company X designs ABCD widget

Thus, with the relevant attributes for which incoming documents
are to be analysed identified within the database, the analysis
process in this example identifies a high, statistically significant
correlation on the <from:> term, that is, the sender address of
incoming e-mails. Thus, with the file and folders identified by
the string "Companies\Retailers\Mr Smiths Shop", an e-mail
identified as <from: xxx@smithshop.co.uk> would be identified
and routed quickly to the appropriate folder.

Similarly, replies to and messages sent to <to:>
board@confed.org would correlate for the
"Companies\Retailers\Confederation of retailers" folder.

Furthermore, if a significant number of e-mails with such a 
source address were already filed within that folder, that address 
would be noted as a significant attribute for that folder and 
stored within the database for subsequent use by the correlator. 
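
The adaptive noting of an address as a significant attribute can be sketched as follows. This is an illustration under assumptions: the `AddressTracker` class, its method name and the threshold of three filings are hypothetical, since the description says only that a "significant number" of e-mails must already be filed in the folder.

```python
from collections import Counter, defaultdict


class AddressTracker:
    """Counts the addresses filed per folder and promotes frequent ones."""

    def __init__(self, threshold=3):
        self.threshold = threshold              # assumed value for "significant number"
        self.counts = defaultdict(Counter)      # folder -> address -> filings seen
        self.significant = defaultdict(set)     # folder -> promoted addresses

    def record_filing(self, folder, address):
        """Record one filed e-mail; return True once the address is significant."""
        self.counts[folder][address] += 1
        if self.counts[folder][address] >= self.threshold:
            self.significant[folder].add(address)
        return address in self.significant[folder]


tracker = AddressTracker(threshold=3)
folder = "Companies\\Retailers\\Confederation of retailers"
results = [tracker.record_filing(folder, "board@confed.org") for _ in range(3)]
print(results)
```

Only the promoted addresses in `significant` would be consulted by the correlator, which keeps the stored attribute set small.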

The keywords, "Company X" and "ABCD" could be extracted 
from the headers of the e-mail for the folder 
"Technical\Distribution" and stored within the correlation 
database. 

Typically, as the filing system grows in complexity and the
diversity of the content filed increases, the advantage of the
adaptive system over simple, explicit filter-based sorting will
become more apparent.

In one enhancement of the system, a degree of user "bias" can
be specified for a folder if desired. For example, even though a
high degree of correlation may be attributable to, say, an e-mail
address, a specific keyword may be more important. Thus, if the
user receives a relatively large number of e-mails from Company
X on technology Y, they may wish to file those e-mails in the
folder relating to technology Y rather than in a folder relating
to Company X.
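
A user bias of this kind might be applied as a per-folder multiplier on the attribute weights, as in the sketch below. The function name, the weight values and the multiplier of 5.0 are assumptions chosen for illustration; the description does not state how the bias is represented.

```python
def biased_score(matched_attributes, base_weights, folder_bias=None):
    """Sum the base weight of each matched attribute, scaled by any
    user-specified bias for that attribute on this folder (default 1.0)."""
    folder_bias = folder_bias or {}
    return sum(
        base_weights[attr] * folder_bias.get(attr, 1.0)
        for attr in matched_attributes
    )


base_weights = {"sender": 3.0, "keyword": 1.0}   # assumed values

# Folder for Company X: the e-mail matches on the sender address, no bias set.
company_score = biased_score(["sender"], base_weights)

# Folder for technology Y: the e-mail matches on a keyword, and the user has
# biased keyword matches for this folder.
technology_score = biased_score(["keyword"], base_weights, {"keyword": 5.0})

print(company_score, technology_score)
```

Without the bias the sender match would rank the Company X folder first; with it, the technology Y folder scores higher and would head the shortcut list.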

The accompanying Figure provides one arrangement of the
invention in schematic fashion and refers to the example
described previously.




Thus, in accordance with the system in this example, three
e-mails are stored in the e-mail inbox 2, as indicated. This
information is passed into a feature or attribute correlator 4.
The settings for this correlator are set by the system and/or user
and are indicated by reference numeral 6. When the correlation
between the set attributes and the attributes of the e-mails from
the inbox is completed, the information is referred to a
weighting and sorting processor 8, which includes data relating
to the particular weighting of each attribute with respect to the
other attributes. At the same time the user can receive an
indication 10 of those files and folders which are provided as
"shortcut" locations.

When the new e-mails have passed through the correlation and 
weighting process they can then be filed and stored 12, if 
appropriate, in one of the shortcut storage locations or 
alternatively in the hierarchical storage system. 

As each new e-mail is filed, the attributes of the folder receiving
it can be recalculated and the database updated.

It is also preferred that at time intervals the whole system is 
reviewed 14 to maintain statistical correlation of the attributes 
set and the weighting of the same in response to the documents 
which have been received and stored. 



INBOX

    E: to: board@confed.org "Response to latest press ..."
    E: to: joe@work from: msmith@shop.com "supply of ..."
    E: to: joe@work from: random@newco "Our ABCD is the best."

Feature Correlator

DATABASE

    <features>
        <from:>
        <to:>
        <text correlations>
        <header-folder>
        <header-keywords>
        <etc.>

    Significant features by folder:
        <Mr Smith>
            <Smith, smith.com ...>
        <Technical>
            <ABCD, MP3, ...>

    User modified weightings by folder:

Weighting and sorting

User displayed shortcut list:
    <Mr Smiths>
    <Confederation of >
    <Widgets ABCD>
    Other ...

Filing

    Adaptive update of that folder's feature
    correlation based on new e-mail

User selection

Background

    Occasional integrity check across all
    folders to maintain statistical
    correlation over time (key feature
    based separation of folders)

Figure 1



